Robot Learning and Execution of Collaborative Manipulation Plans from YouTube Cooking Videos
People often watch videos on the web to learn how to cook new recipes,
assemble furniture or repair a computer. We wish to enable robots with the very
same capability. This is challenging; there is a large variation in
manipulation actions and some videos even involve multiple persons, who
collaborate by sharing and exchanging objects and tools. Furthermore, the
learned representations need to be general enough to be transferable to robotic
systems. On the other hand, previous work has shown that the space of human
manipulation actions has a linguistic, hierarchical structure that relates
actions to manipulated objects and tools. Building upon this theory of language
for action, we propose a framework for understanding and executing demonstrated
action sequences from full-length, unconstrained cooking videos on the web. The
framework takes as input a cooking video annotated with object labels and
bounding boxes, and outputs a collaborative manipulation action plan for one or
more robotic arms. We demonstrate the performance of the system on a standardized
dataset of 100 YouTube cooking videos, as well as on three full-length YouTube
videos that include collaborative actions between two participants. We
additionally propose an open-source platform for executing the learned plans in
a simulation environment as well as with an actual robotic arm.
Composing Recurrent Spiking Neural Networks using Locally-Recurrent Motifs and Risk-Mitigating Architectural Optimization
In neural circuits, recurrent connectivity plays a crucial role in network
function and stability. However, existing recurrent spiking neural networks
(RSNNs) are often constructed by random connections without optimization. While
RSNNs can produce rich dynamics that are critical for memory formation and
learning, systematic architectural optimization of RSNNs is still an open
challenge. We aim to enable systematic design of large RSNNs via a new scalable
RSNN architecture and automated architectural optimization. We compose RSNNs
based on a layer architecture called Sparsely-Connected Recurrent Motif Layer
(SC-ML) that consists of multiple small recurrent motifs wired together by
sparse lateral connections. The small size of the motifs and sparse inter-motif
connectivity lead to an RSNN architecture scalable to large network sizes. We
further propose a method called Hybrid Risk-Mitigating Architectural Search
(HRMAS) to systematically optimize the topology of the proposed recurrent
motifs and SC-ML layer architecture. HRMAS is an alternating two-step
optimization process by which we mitigate the risk of network instability and
performance degradation caused by architectural change by introducing a novel
biologically-inspired "self-repairing" mechanism through intrinsic plasticity.
The intrinsic plasticity is introduced to the second step of each HRMAS
iteration and acts as unsupervised fast self-adaptation to structural and
synaptic weight modifications introduced by the first step during the RSNN
architectural "evolution". To the best of the authors' knowledge, this is the
first work that performs systematic architectural optimization of RSNNs. Using
one speech and three neuromorphic datasets, we demonstrate the significant
performance improvement brought by the proposed automated architecture
optimization over existing manually-designed RSNNs.
Comment: 20 pages, 7 figures
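The alternating two-step structure described in this abstract can be illustrated with a minimal sketch. It is not the authors' HRMAS implementation: the rate-based `ToyNeuron`, the fixed `extra_drive` perturbation standing in for the architectural step, and the threshold update rule are all assumptions chosen only to show how a fast homeostatic adjustment can absorb a structural change.

```python
from dataclasses import dataclass


@dataclass
class ToyNeuron:
    """Hypothetical rate-based stand-in for a spiking neuron."""
    drive: float      # total input, changed by the architectural step
    threshold: float  # intrinsic parameter, changed by the plasticity step

    def rate(self) -> float:
        # Rectified-linear proxy for a firing rate (not a spiking model).
        return max(0.0, self.drive - self.threshold)


def architecture_step(neuron: ToyNeuron, extra_drive: float) -> None:
    """Step 1 (assumed): a structural change that perturbs the input drive."""
    neuron.drive += extra_drive


def plasticity_step(neuron: ToyNeuron, target_rate: float,
                    lr: float = 0.5, updates: int = 20) -> None:
    """Step 2 (assumed): unsupervised threshold adaptation toward a target rate."""
    for _ in range(updates):
        neuron.threshold += lr * (neuron.rate() - target_rate)


def alternate(neuron: ToyNeuron, iterations: int,
              extra_drive: float, target_rate: float) -> list[float]:
    """Alternate the two steps and record the rate after each repair phase."""
    rates = []
    for _ in range(iterations):
        architecture_step(neuron, extra_drive)
        plasticity_step(neuron, target_rate)
        rates.append(neuron.rate())
    return rates


neuron = ToyNeuron(drive=1.0, threshold=0.0)
rates = alternate(neuron, iterations=3, extra_drive=0.5, target_rate=0.2)
```

In this toy setting each structural perturbation pushes the rate away from the target, and the threshold updates bring it back before the next perturbation is applied.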
PATO: Policy Assisted TeleOperation for Scalable Robot Data Collection
Large-scale data is an essential component of machine learning as
demonstrated in recent advances in natural language processing and computer
vision research. However, collecting large-scale robotic data is much more
expensive and slower, as each operator can control only a single robot at a
time. To make this costly data collection process efficient and scalable, we
propose Policy Assisted TeleOperation (PATO), a system which automates part of
the demonstration collection process using a learned assistive policy. PATO
autonomously executes repetitive behaviors in data collection and asks for
human input only when it is uncertain about which subtask or behavior to
execute. We conduct teleoperation user studies both with a real robot and a
simulated robot fleet and demonstrate that our assisted teleoperation system
reduces human operators' mental load while improving data collection
efficiency. Further, it enables a single operator to control multiple robots in
parallel, which is a first step towards scalable robotic data collection. For
code and video results, see https://clvrai.com/pato
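The idea of asking for human input only under uncertainty can be sketched as follows. This is an illustrative assumption, not the PATO code: the abstract does not say how uncertainty is measured, so the sketch uses disagreement among a small ensemble of policies, and the names `assisted_step`, `ask_human`, and `threshold` are hypothetical.

```python
from statistics import pvariance
from typing import Callable, List, Sequence, Tuple

Observation = Sequence[float]
Action = List[float]
Policy = Callable[[Observation], Action]


def ensemble_disagreement(proposals: Sequence[Action]) -> float:
    """Mean per-dimension population variance across proposed actions."""
    dims = list(zip(*proposals))
    return sum(pvariance(d) for d in dims) / len(dims)


def assisted_step(observation: Observation,
                  ensemble: Sequence[Policy],
                  ask_human: Callable[[Observation], Action],
                  threshold: float = 0.05) -> Tuple[Action, str]:
    """Act autonomously when the ensemble agrees; otherwise defer to the operator."""
    proposals = [policy(observation) for policy in ensemble]
    if ensemble_disagreement(proposals) > threshold:
        # Assumed uncertainty signal: the policies disagree, so request human input.
        return ask_human(observation), "human"
    # The policies agree closely enough: execute their mean action.
    mean_action = [sum(d) / len(d) for d in zip(*proposals)]
    return mean_action, "policy"


# Usage with hypothetical stand-in policies and a stand-in operator.
agreeing = [lambda obs: [0.1, 0.2], lambda obs: [0.1, 0.2]]
conflicting = [lambda obs: [0.0, 0.0], lambda obs: [1.0, 1.0]]
operator = lambda obs: [9.0, 9.0]

auto_action, auto_source = assisted_step([0.0], agreeing, operator)
human_action, human_source = assisted_step([0.0], conflicting, operator)
```

Because the operator is queried only when the gate fires, one person could in principle serve requests from several robots running the same loop, which is the scaling argument the abstract makes.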